NSF PAR Search | NSF Public Access Repository

Multimodal Fusion of Smart Home and Text-based Behavior Markers for Clinical Assessment Prediction

https://doi.org/10.1145/3531231

Sprint, Gina; Cook, Diane J.; Schmitter-Edgecombe, Maureen; Holder, Lawrence B. (October 2022, ACM Transactions on Computing for Healthcare)

New modes of technology are offering unprecedented opportunities to unobtrusively collect data about people's behavior. While there are many use cases for such information, we explore its utility for predicting multiple clinical assessment scores. Because clinical assessments are typically used as screening tools for impairment and disease, such as mild cognitive impairment (MCI), automatically mapping behavioral data to assessment scores can help detect changes in health and behavior across time. In this article, we aim to extract behavior markers from two modalities, a smart home environment and a custom digital memory notebook app, for mapping to 10 clinical assessments that are relevant for monitoring MCI onset and changes in cognitive health. Smart-home-based behavior markers reflect hourly, daily, and weekly activity patterns, while app-based behavior markers reflect app usage and writing content/style derived from free-form journal entries. We describe machine learning techniques for fusing these multimodal behavior markers and utilizing joint prediction. We evaluate our approach using three regression algorithms and data from 14 participants with MCI living in a smart-home environment. We observed moderate to large correlations between predicted and ground-truth assessment scores, ranging from r = 0.601 to r = 0.871 for each clinical assessment.
Automated Cognitive Health Assessment Using Partially Complete Time Series Sensor Data

https://doi.org/10.1055/s-0042-1756649

Thomas, Brian L.; Holder, Lawrence B.; Cook, Diane J. (September 2022, Methods of Information in Medicine)

Background: Behavior and health are inextricably linked. As a result, continuous wearable sensor data offer the potential to predict clinical measures. However, interruptions in the data collection occur, which create a need for strategic data imputation. Objective: The objective of this work is to adapt a data generation algorithm to impute multivariate time series data. This will allow us to create digital behavior markers that can predict clinical health measures. Methods: We created a bidirectional time series generative adversarial network to impute missing sensor readings. Values are imputed based on relationships between multiple fields and multiple points in time, for single time points or larger time gaps. From the complete data, digital behavior markers are extracted and are mapped to predicted clinical measures. Results: We validate our approach using continuous smartwatch data for n = 14 participants. When reconstructing omitted data, we observe an average normalized mean absolute error of 0.0197. We then create machine learning models to predict clinical measures from the reconstructed, complete data with correlations ranging from r = 0.1230 to r = 0.7623. This work indicates that wearable sensor data collected in the wild can be used to offer insights on a person's health in natural settings.
Temporal Analysis of Epidemiology indicators and Air Travel Data for Covid-19

Purohit, Sumit; Shelobolin, Filipp; Holder, Lawrence; Chin, George (July 2021, SIAM Conference on Applied and Computational Discrete Algorithms)

Coronavirus Disease 2019 (Covid-19) is an ongoing outbreak and the latest threat to global health. It is imperative to understand the implications of social interaction on Covid-19 indicators in order to help formulate policies and guidelines by governments and local authorities. We present a case-study of curating state-level Covid-19 indicators such as Active Cases, Deaths, Hospitalization Rate, etc. for the United States. We also curate open source domestic US air travel data and present its impact on Covid-19 indicators. We perform a time-series analysis of the dataset using Independent Temporal Motif (ITeM) to find weekly trends in the data. We publish the dataset and the results for further exploration by the research community.
Measuring the Relative Similarity and Difficulty Between AI Benchmark Problems

Pereyda, Christopher; Holder, Lawrence (February 2020, Workshop on Evaluating Evaluation of AI Systems, AAAI Conference on Artificial Intelligence)

There has been an explosion of challenge problems, algorithmic tests and datasets for evaluating AI systems. Yet no methodology exists to objectively measure either the collective difficulty of these problems or their similarity. This is an obstacle to creating more general AI systems. We propose a theory for measuring the similarity between pair-wise problems. We evaluate this theory by utilizing a methodology based on a deep neural network to objectively measure these properties between test problems using foundational datasets. An implementation of these methods is then used to measure the similarity between well-known datasets. Results show that the proposed measure successfully identifies the difficulty and similarity among problems. This can be used to ensure diversity in test suites used to evaluate AI systems.
Application-Specific Graph Sampling for Frequent Subgraph Mining and Community Detection

Purohit, Sumit; Holder, Lawrence; Choudhury, Sutanay (December 2017, IEEE International Conference on Big Data)

Graph mining is an important data analysis methodology, but struggles as the input graph size increases. The scalability and usability challenges posed by such large graphs make it imperative to sample the input graph and reduce its size. The critical challenge in sampling is to identify the appropriate algorithm to ensure the resulting analysis does not suffer heavily from the data reduction. Predicting the expected performance degradation for a given graph and sampling algorithm is also useful. In this paper, we present different sampling approaches for graph mining applications such as Frequent Subgraph Mining (FSM) and Community Detection (CD). We explore graph metrics such as PageRank, Triangles, and Diversity to sample a graph and conclude that for heterogeneous graphs Triangles and Diversity perform better than degree-based metrics. We also present two new sampling variations for targeted graph mining applications. We present empirical results to show that knowledge of the target application, along with input graph properties, can be used to select the best sampling algorithm. We also conclude that performance degradation is an abrupt, rather than gradual, phenomenon as the sample size decreases. We present the empirical results to show that the performance degradation follows a logistic function.
GraphZip: Mining Graph Streams using Dictionary-based Compression

Packer, Charles; Holder, Lawrence B. (August 2017, SIGKDD Workshop on Mining and Learning in Graphs (MLG))

A massive amount of data generated today on platforms such as social networks, telecommunication networks, and the internet in general can be represented as graph streams. Activity in a network’s underlying graph generates a sequence of edges in the form of a stream; for example, a social network may generate a graph stream based on the interactions (edges) between different users (nodes) over time. While many graph mining algorithms have already been developed for analyzing relatively small graphs, graphs that begin to approach the size of real-world networks stress the limitations of such methods due to their dynamic nature and the substantial number of nodes and connections involved. In this paper we present GraphZip, a scalable method for mining interesting patterns in graph streams. GraphZip is inspired by the Lempel-Ziv (LZ) class of compression algorithms, and uses a novel dictionary-based compression approach to discover maximally-compressing patterns in a graph stream. We experimentally show that GraphZip is able to retrieve complex and insightful patterns from large real-world graphs and artificially-generated graphs with ground truth patterns. Additionally, our results demonstrate that GraphZip is both highly efficient and highly effective compared to existing state-of-the-art methods for mining graph streams.
Deep Learning Approach to Link Weight Prediction

Hou, Yuchen; Holder, Lawrence B. (May 2017, International Joint Conference on Neural Networks (IJCNN))

Deep learning has been successful in various domains including image recognition, speech recognition and natural language processing. However, the research on its application in graph mining is still in an early stage. Here we present Model R, a neural network model created to provide a deep learning approach to the link weight prediction problem. This model extracts knowledge of nodes from known links’ weights and uses this knowledge to predict unknown links’ weights. We demonstrate the power of Model R through experiments and compare it with the stochastic block model and its derivatives. Model R shows that deep learning can be successfully applied to link weight prediction, and it outperforms the stochastic block model and its derivatives by up to 73% in terms of prediction accuracy. We anticipate that this new approach will provide effective solutions to more graph mining tasks.
Using Graphical Features To Improve Demographic Prediction From Smart Phone Data

Akter, Syeda; Holder, Lawrence B. (May 2017, SIGMOD Workshop on Network Data Analytics (NDA))

Demographic information such as gender, age, ethnicity, level of education, disabilities, employment, and socio-economic status is important in social science, surveys, and marketing. But it is difficult to obtain demographic information from users because they are reluctant to participate and response rates are low. Through automated demographics prediction from smart phone sensor data, researchers can obtain this valuable information in a nonintrusive and cost-effective manner. We approach the problem of demographic prediction, namely, classification of gender, age group and job type, through the use of a graphical feature based framework. The framework represents information collected from sensor networks as graphs, extracts useful and relevant graphical features, and predicts demographic information. We evaluated our approach on the Nokia Mobile Phone dataset for the three classification tasks: gender, age-group and job-type. Our approach produced results comparable to most state-of-the-art methods while having the additional advantage of general applicability to sensor networks without using sophisticated and application-specific feature generation techniques, background knowledge and special techniques to address class imbalance.
